Dubrovnik
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > Canada > Ontario > Toronto (0.14)
- Europe > Italy > Friuli Venezia Giulia > Trieste Province > Trieste (0.04)
- (10 more...)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Materials > Chemicals > Commodity Chemicals > Petrochemicals (0.46)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > California (0.14)
- Europe > Italy > Tuscany > Florence (0.04)
- (12 more...)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Dominican Republic (0.04)
- (4 more...)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
- North America > Dominican Republic (0.04)
- Europe > Croatia > Dubrovnik-Neretva County > Dubrovnik (0.04)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
- Research Report > Experimental Study (0.74)
- Research Report > New Finding (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.83)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.82)
- Europe > France (0.15)
- Europe > United Kingdom (0.14)
- North America > Canada > Ontario > Toronto (0.14)
- (28 more...)
- Transportation > Passenger (1.00)
- Transportation > Marine (1.00)
- Leisure & Entertainment > Sports > Football (1.00)
- (4 more...)
- North America > Canada > Quebec > Montreal (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Denmark > Capital Region > Copenhagen (0.04)
- (2 more...)
- Asia > South Korea (0.05)
- Oceania > Australia (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- (14 more...)
- Leisure & Entertainment (1.00)
- Education (0.68)
- Media > Television (0.46)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.99)
- Information Technology > Hardware (0.93)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > South Korea > Seoul > Seoul (0.04)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
- (2 more...)
- North America > United States (0.14)
- Asia > Singapore (0.04)
- Europe > Switzerland (0.04)
- Europe > Croatia > Dubrovnik-Neretva County > Dubrovnik (0.04)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
A Unified Definition of Hallucination, Or: It's the World Model, Stupid
Liu, Emmy, Gangal, Varun, Zou, Chelsea, Huang, Xiaoqi, Yu, Michael, Chang, Alex, Tao, Zhuofu, Kumar, Sachin, Feng, Steven Y.
Despite numerous attempts to solve the issue of hallucination since the inception of neural language models, it remains a problem in even frontier large language models today. Why is this the case? We walk through definitions of hallucination used in the literature from a historical perspective up to the current day, and fold them into a single definition of hallucination, wherein different prior definitions focus on different aspects of our definition. At its core, we argue that hallucination is simply inaccurate (internal) world modeling, in a form where it is observable to the user (e.g., stating a fact which contradicts a knowledge base, or producing a summary which contradicts a known source). By varying the reference world model as well as the knowledge conflict policy (e.g., knowledge base vs. in-context), we arrive at the different existing definitions of hallucination present in the literature. We argue that this unified view is useful because it forces evaluations to make clear their assumed "world" or source of truth, clarifies what should and should not be called hallucination (as opposed to planning or reward/incentive-related errors), and provides a common language to compare benchmarks and mitigation techniques. Building on this definition, we outline plans for a family of benchmarks in which hallucinations are defined as mismatches with synthetic but fully specified world models in different environments, and sketch out how these benchmarks can use such settings to stress-test and improve the world modeling components of language models.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Ohio (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- (4 more...)
- Research Report (0.50)
- Personal > Honors (0.47)